AITopics | cost constraint

Collaborating Authors

cost constraint

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Incentive-Aware Dynamic Resource Allocation under Long-Term Cost Constraints

Neural Information Processing SystemsJun-19-2026, 05:53:41 GMT

Motivated by applications such as cloud platforms allocating GPUs to users or governments deploying mobile health units across competing regions, we study the constrained dynamic allocation of a reusable resource to a group of strategic agents. Our objective is to simultaneously (i) maximize social welfare, (ii) satisfy multidimensional long-term cost constraints, and (iii) incentivize truthful reporting. We begin by numerically evaluating primal-dual methods widely used in constrained online optimization and find them to be highly fragile in strategic settings - agents can easily manipulate their reports to distort future dual updates for future gain. To address this vulnerability, we develop an incentive-aware framework that makes primal-dual methods robust to strategic behavior. Our primal-side design combines epoch-based lazy updates - discouraging agents from distorting dual updates - with dual-adjust pricing and randomized exploration techniques that extract approximately truthful signals for learning. On the dual side, we design a novel online learning subroutine to resolve a circular dependency between actions and predictions; this makes our mechanism achieve eO( T)social welfare regret (where T is the number of allocation rounds), satisfies all cost constraints, and ensures incentive alignment. This eO( T) performance matches that of non-strategic allocation approaches while additionally exhibiting robustness to strategic agents.

agent, artificial intelligence, machine learning, (18 more...)

Neural Information Processing Systems

Genre: Research Report > Experimental Study (1.00)

Industry:

Information Technology > Services (0.87)
Education > Educational Setting > Online (0.35)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

CHPO: Constrained Hybrid-action Policy Optimization for Reinforcement Learning

Neural Information Processing SystemsJun-17-2026, 15:59:28 GMT

Constrained hybrid-action reinforcement learning (RL) promises to learn a safe policy within a parameterized action space, which is particularly valuable for safety-critical applications involving discrete-continuous hybrid action spaces. However, existing hybrid-action RL algorithms primarily focus on reward maximization, which faces significant challenges for tasks involving both cost constraints and hybrid action spaces. In this work, we propose a novel Constrained Hybrid-action Policy Optimization algorithm (CHPO) to address the problems of constrained hybrid-action RL. Concretely, we rethink the limitations of hybridaction RL in handling safe tasks with parameterized action spaces and reframe the objective of constrained hybrid-action RL by introducing the concept of Constrained Parameterized-action Markov Decision Process (CPMDP). Subsequently, we present a constrained hybrid-action policy optimization algorithm to confront the constrained hybrid-action problems and conduct theoretical analyses demonstrating that the CHPO converges to the optimal solution while satisfying safety constraints. Finally, extensive experiments demonstrate that the CHPO achieves competitive performance across multiple experimental tasks. Our code is available at github.CHPO.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Neural Information Processing Systems

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)

Industry:

Information Technology (0.68)
Transportation (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Small Total-Cost Constraints in Contextual Bandits with Knapsacks, with Application to Fairness

Neural Information Processing SystemsFeb-15-2026, 22:13:41 GMT

The setting described is summarized in Box A.

artificial intelligence, constraint, machine learning, (20 more...)

Neural Information Processing Systems

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)

Genre:

Research Report > New Finding (0.46)
Overview (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (0.46)

Add feedback

Monte-Carlo Tree Search for Constrained POMDPs

Jongmin Lee, Geon-hyeong Kim, Pascal Poupart, Kee-Eung Kim

Neural Information Processing SystemsFeb-12-2026, 20:45:58 GMT

Neural Information Processing Systems http://nips.cc/

cc-pomcp, cost constraint, cpomdp, (12 more...)

Neural Information Processing Systems

Country:

North America > Canada > Quebec > Montreal (0.04)
Europe > Estonia > Tartu County > Tartu (0.04)

Industry: Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Add feedback

af5d5ef24881f3c3049a7b9bfe74d58b-Paper.pdf

Neural Information Processing SystemsFeb-9-2026, 20:36:43 GMT

algorithm, constraint, international conference, (14 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Jordan (0.04)
Asia > China > Shanghai > Shanghai (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
(4 more...)

Genre:

Research Report (0.69)
Overview (0.46)

Industry: Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Prompt Optimization as a State-Space Search Problem

Taneja, Maanas

arXiv.org Artificial IntelligenceNov-25-2025

Language Models are extremely susceptible to performance collapse with even small changes to input prompt strings. Libraries such as DSpy (from Stanford NLP) avoid this problem through demonstration-based prompt optimisation. Inspired by this, I propose an alternative approach that treats prompt optimisation as a classical state-space search problem. I model the prompt space as a graph where nodes represent prompt states and edges correspond to deliberate transformations such as shortening, adding examples, or re- ordering content. Using beam search and random walk algorithms, I systematically explore this space, evaluating candidates on development sets and pruning unpromising branches. Across five NLP tasks (sentiment classification, question answering, summarisation, reason- ing, and natural language inference), I find that even shallow search configurations (beam width=2, depth=2) improve upon seed prompts on development sets. For instance, beam search achieves development accuracy gains from 0.40 to 0.80 on reasoning tasks, though test set improvements are more modest (0.20 to 0.50), indicating overfitting to the develop- ment heuristic. Analysis of successful optimisation paths reveals that transformations that make prompts concise appear most frequently, while verbosity operators are never selected. My results validate prompt optimization as a search problem and suggest that with greater computational resources and improved evaluation metrics, deeper exploration could yield more robust prompts that generalize beyond development sets. Code and implementation are available at [https://github.com/MaanasTaneja/PromptOptimiser].

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2511.18619

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.30)

Add feedback

Monte-Carlo Tree Search for Constrained POMDPs

Jongmin Lee, Geon-hyeong Kim, Pascal Poupart, Kee-Eung Kim

Neural Information Processing SystemsNov-20-2025, 16:32:04 GMT

However, many real-world problems inherently have multiple goals, where multi-objective formulations are more natural.

artificial intelligence, machine learning, planning & scheduling, (13 more...)

Neural Information Processing Systems

Country:

North America > Canada > Quebec > Montreal (0.04)
Europe > Estonia > Tartu County > Tartu (0.04)

Industry: Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Add feedback

Online Optimization for Offline Safe Reinforcement Learning

Chemingui, Yassine, Deshwal, Aryan, Fern, Alan, Nguyen-Tang, Thanh, Doppa, Janardhan Rao

arXiv.org Machine LearningOct-28-2025

We study the problem of Offline Safe Reinforcement Learning (OSRL), where the goal is to learn a reward-maximizing policy from fixed data under a cumulative cost constraint. We propose a novel OSRL approach that frames the problem as a minimax objective and solves it by combining offline RL with online optimization algorithms. We prove the approximate optimality of this approach when integrated with an approximate offline RL oracle and no-regret online optimization. We also present a practical approximation that can be combined with any offline RL algorithm, eliminating the need for offline policy evaluation. Empirical results on the DSRL benchmark demonstrate that our method reliably enforces safety constraints under stringent cost budgets, while achieving high rewards. The code is available at https://github.com/yassineCh/O3SRL.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

arXiv.org Machine Learning

2510.22027

Country:

North America > United States > Washington (0.04)
North America > United States > Oregon (0.04)
North America > United States > New Jersey (0.04)
North America > United States > Minnesota (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

A Proof of the strong duality (4) In this section, we explain why the equalities (4) hold when the problem (r, c, B

Neural Information Processing SystemsOct-9-2025, 01:36:59 GMT

The first and third equalities are straightforward. We restate a result extracted from the monograph by Luenberger [1969]. It relies on the dual functional φ, whose expression we recall below. Theorem 2 (stated as Theorem 1 in Section 8.6, page 224 in Luenberger, 1969) . " is required to apply the theorem.

artificial intelligence, constraint, machine learning, (19 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.46)

Technology: